3 research outputs found

Generalised Pattern Matching Revisited

Author: Dudek Bartłomiej
Gawrychowski Paweł
Starikovskaya Tatiana
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)
Publication date: 01/01/2020

In the problem of Generalised Pattern Matching (GPM) [STOC'94, Muthukrishnan and Palem], we are given a text $T$ of length $n$ over an alphabet $\Sigma_T$, a pattern $P$ of length $m$ over an alphabet $\Sigma_P$, and a matching relationship $\subseteq \Sigma_T \times \Sigma_P$, and must return all substrings of $T$ that match $P$ (reporting) or the number of mismatches between each substring of $T$ of length $m$ and $P$ (counting). In this work, we improve over all previously known algorithms for this problem for various parameters describing the input instance:

* $\mathcal{D}$ being the maximum number of characters that match a fixed character,
* $\mathcal{S}$ being the number of pairs of matching characters,
* $\mathcal{I}$ being the total number of disjoint intervals of characters that match the $m$ characters of the pattern $P$.

At the heart of our new deterministic upper bounds for $\mathcal{D}$ and $\mathcal{S}$ lies a faster construction of superimposed codes, which solves an open problem posed in [FOCS'97, Indyk] and can be of independent interest. To conclude, we demonstrate first lower bounds for GPM. We start by showing that any deterministic or Monte Carlo algorithm for GPM must use $\Omega(\mathcal{S})$ time, and then proceed to show higher lower bounds for combinatorial algorithms. These bounds show that our algorithms are almost optimal, unless a radically new approach is developed.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Dagstuhl Research Online Publication Server

Counting 4-Patterns in Permutations Is Equivalent to Counting 4-Cycles in Graphs

Author: Dudek Bartłomiej
Gawrychowski Paweł
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st International Symposium on Algorithms and Computation (ISAAC 2020)
Publication date: 01/01/2020

Permutation $\sigma$ appears in permutation $\pi$ if there exists a subsequence of $\pi$ that is order-isomorphic to $\sigma$. The natural algorithmic question is to check if $\sigma$ appears in $\pi$, and if so count the number of occurrences. Only since very recently we know that for any fixed length $k$, we can check if a given pattern of length $k$ appears in a permutation of length $n$ in time linear in $n$, but being able to count all such occurrences in $f(k) \cdot n^{o(k/\log k)}$ time would refute the exponential time hypothesis (ETH). Together with practical applications in statistics, this motivates a systematic study of the complexity of counting occurrences for different patterns of fixed small length $k$. We investigate this question for $k = 4$. Very recently, Even-Zohar and Leng [arXiv 2019] identified two types of 4-patterns. For the first type they designed an $\tilde{\mathcal{O}}(n)$ time algorithm, while for the second they were able to provide an $\tilde{\mathcal{O}}(n^{1.5})$ time algorithm. This brings up the question whether the permutations of the second type are inherently harder than the first type. We establish a connection between counting 4-patterns of the second type and counting 4-cycles (not necessarily induced) in a sparse undirected graph. By designing two-way reductions we show that the complexities of both problems are the same, up to polylogarithmic factors. This allows us to leverage the work done on the latter to provide a reasonable argument for why there is a difference in the complexities for counting 4-patterns of the first and the second type. In particular, even for the seemingly simpler problem of detecting a 4-cycle in a graph on $m$ edges, the best known algorithm works in $\mathcal{O}(m^{4/3})$ time. Our reductions imply that an $\mathcal{O}(n^{4/3-\varepsilon})$ time algorithm for counting occurrences of any 4-pattern of the second type in a permutation of length $n$ would imply an exciting breakthrough for counting (and hence also detecting) 4-cycles. In the other direction, by plugging in the fastest known algorithm for counting 4-cycles, we obtain an algorithm for counting occurrences of any 4-pattern of the second type in $\mathcal{O}(n^{1.48})$ time.
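To make the notion of an occurrence concrete, the definition in the abstract above can be checked by brute force: enumerate every length-k subsequence of the longer permutation and compare its relative order with the pattern. This is a minimal sketch of the definition only, with running time on the order of n^k, and is unrelated to the reductions or the fast algorithms mentioned in the abstract; the function names are hypothetical.

```python
from itertools import combinations


def relative_order(seq):
    """Return the rank of each element of `seq` (0 = smallest).

    Two sequences of distinct values are order-isomorphic exactly when
    their rank tuples are equal.
    """
    order = sorted(range(len(seq)), key=lambda i: seq[i])
    ranks = [0] * len(seq)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return tuple(ranks)


def count_occurrences(pattern, permutation):
    """Count subsequences of `permutation` order-isomorphic to `pattern`."""
    k = len(pattern)
    target = relative_order(pattern)
    return sum(
        1
        for positions in combinations(range(len(permutation)), k)
        if relative_order([permutation[i] for i in positions]) == target
    )
```

For example, the pattern `[1, 2]` occurs twice in `[1, 3, 2]` (as the subsequences 1, 3 and 1, 2), while `[2, 1]` occurs once (as 3, 2).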

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Optimal Near-Linear Space Heaviest Induced Ancestors

Author: Charalampopoulos Panagiotis
Dudek Bartłomiej
Gawrychowski Paweł
Pokorski Karol
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)
Publication date: 01/01/2023

Dagstuhl Research Online Publication Server